<!DOCTYPE html>

<html>

  <head>
    <title>Underactuated Robotics: Acrobots, Cart-Poles, and Quadrotors</title>
    <meta name="Underactuated Robotics: Acrobots, Cart-Poles, and Quadrotors" content="text/html; charset=utf-8;" />
    <link rel="canonical" href="http://underactuated.mit.edu/acrobot.html" />

    <script src="https://hypothes.is/embed.js" async></script>
    <script type="text/javascript" src="htmlbook/book.js"></script>

    <script src="htmlbook/mathjax-config.js" defer></script> 
    <script type="text/javascript" id="MathJax-script" defer
      src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js">
    </script>
    <script>window.MathJax || document.write('<script type="text/javascript" src="htmlbook/MathJax/es5/tex-chtml.js" defer><\/script>')</script>

    <link rel="stylesheet" href="htmlbook/highlight/styles/default.css">
    <script src="htmlbook/highlight/highlight.pack.js"></script> <!-- http://highlightjs.readthedocs.io/en/latest/css-classes-reference.html#language-names-and-aliases -->
    <script>hljs.initHighlightingOnLoad();</script>

    <link rel="stylesheet" type="text/css" href="htmlbook/book.css" />
  </head>

<body onload="loadChapter('underactuated');">

<div data-type="titlepage">
  <header>
    <h1><a href="index.html" style="text-decoration:none;">Underactuated Robotics</a></h1>
    <p data-type="subtitle">Algorithms for Walking, Running, Swimming, Flying, and Manipulation</p> 
    <p style="font-size: 18px;"><a href="http://people.csail.mit.edu/russt/">Russ Tedrake</a></p>
    <p style="font-size: 14px; text-align: right;"> 
      &copy; Russ Tedrake, 2020<br/>
      <a href="tocite.html">How to cite these notes</a> &nbsp; | &nbsp;
      <a target="_blank" href="https://docs.google.com/forms/d/e/1FAIpQLSesAhROfLRfexrRFebHWLtRpjhqtb8k_iEagWMkvc7xau08iQ/viewform?usp=sf_link">Send me your feedback</a><br/>
    </p>
  </header>
</div>

<p><b>Note:</b> These are working notes used for <a
href="http://underactuated.csail.mit.edu/Spring2020/">a course being taught
at MIT</a>. They will be updated throughout the Spring 2020 semester.  <a 
href="https://www.youtube.com/channel/UChfUOAhz7ynELF-s_1LPpWg">Lecture  videos are available on YouTube</a>.</p> 

<table style="width:100%;"><tr style="width:100%">
  <td style="width:33%;text-align:left;"><a class="previous_chapter" href=pend.html>Previous Chapter</a></td>
  <td style="width:33%;text-align:center;"><a href=index.html>Table of contents</a></td>
  <td style="width:33%;text-align:right;"><a class="next_chapter" href=simple_legs.html>Next Chapter</a></td>
</tr></table>


<!-- EVERYTHING ABOVE THIS LINE IS OVERWRITTEN BY THE INSTALL SCRIPT -->
<chapter style="counter-reset: chapter 2">
  <h1>Acrobots, Cart-Poles, and Quadrotors</h1>

  <p> A great deal of work in the control of underactuated systems has been
  done in the context of low-dimensional model systems.  These model   systems
  capture the essence of the problem without introducing all of   the complexity
  that is often involved in more real-world examples. In   this chapter we will
  focus on two of the most well-known and   well-studied model systems--the
  Acrobot and the Cart-Pole.  After we have developed some tools, we will see
  that they can be applied directly to other model systems; we will give a
  number of examples using Quadrotors.  All of these systems are trivially
  underactuated, having less actuators than degrees of freedom.</p>

  <section><h1>The Acrobot</h1>

    <p> The Acrobot is a planar two-link robotic arm in the vertical plane
    (working against gravity), with an actuator at the elbow, but no actuator at
    the shoulder.  It was first described in detail in <elib>Murray91</elib>.
    The companion system, with an actuator at the shoulder but not at the elbow,
    is known as the Pendubot<elib>Spong97</elib>.  The Acrobot is so named
    because of its resemblance to a gymnast (or acrobat) on a parallel bar, who
    controls his/her motion predominantly by effort at the waist (and not effort
    at the wrist).  The most common control task studied for the Acrobot is the
    swing-up task, in which the system must use the elbow (or waist) torque to
    move the system into a vertical configuration then balance. </p>

    <figure>
    <img width="40%" src="figures/acrobot.svg"/>
    <figcaption>The Acrobot.  <a href="http://youtu.be/FeCwtvrD76I" target="_blank">Click here to see a physical Acrobot swing up and balance</a>.</figcaption>
    </figure>

    <p> The Acrobot is representative of the primary challenge in underactuated
    robots.  In order to swing up and balance the entire system, the controller
    must reason about and exploit the state-dependent coupling between the
    actuated degree of freedom and the unactuated degree of freedom.  It is also
    an important system because, as we will see, it closely resembles one of the
    simplest models of a walking robot.</p>

    <subsection><h1>Equations of motion</h1>

      <p> Figure 3.1 illustrates the model parameters used in our analysis.
      $\theta_1$ is the shoulder joint angle, $\theta_2$ is the elbow (relative)
      joint angle, and we will use $\bq = [\theta_1,\theta_2]^T$, $\bx =
      [\bq,\dot\bq]^T$.  The zero configuration is with both links pointed
      directly down.  The moments of inertia, $I_1,I_2$ are taken about the
      pivots<!--<elib>Spong95</elib> uses the center of mass, which differs only
      by an extra term in each inertia from the parallel axis theorem.-->.  The
      task is to stabilize the unstable fixed point $\bx = [\pi,0,0,0]^T$.</p>

      <p>
      We will derive the equations of motion for the Acrobot using the method of
      Lagrange.  The locations of the center of mass of each link, $\bp_{c1},
      \bp_{c2},$ are given by the kinematics: \begin{equation} \bp_{c1} =
      \begin{bmatrix} l_{c1} s_1 \\ -l_{c1} c_1 \end{bmatrix}, \quad \bp_{c2} =
      \begin{bmatrix} l_1 s_1 + l_{c2} s_{1+2} \\ -l_1 c_1 - l_{c2} c_{1+2}
      \end{bmatrix} . \end{equation} The energy <!--\footnote{The complicated
      expression for $T_2$ comes from $T_2 = \frac{1}{2} m_2 v_{c2}^2 +
      \frac{1}{2}I_{c2} (\dot{q}_1 + \dot{q}_2)^2$, where $v_{c2}$ is the
      velocity of the center of mass of the second link and $I_{c2} = I_2 - m_2
      l_{c2}^2$ is the moment of inertia about that point.  Use $c_1 c_{1+2} +
      s_1 s_{1+2} = c_2$ to simplify.}--> is given by: \begin{gather} T = T_1 +
      T_2, \quad T_1 = \frac{1}{2} I_1 \dot{q}_1^2 \\ T_2 = \frac{1}{2} ( m_2
      l_1^2 + I_2 + 2 m_2 l_1 l_{c2} c_2 ) \dot{q}_1^2 + \frac{1}{2} I_2
      \dot{q}_2^2 + (I_2 + m_2 l_1 l_{c2} c_2) \dot{q}_1 \dot{q}_2 \\ 
      <!-- from expanding sum of point masses --> U = -m_1 g l_{c1} c_1 - m_2 g (l_1 c_1 +
      l_{c2} c_{1+2}) \end{gather} Entering these quantities into the Lagrangian
      yields the equations of motion: \begin{gather} (I_1 + I_2 + m_2 l_1^2 +
      2m_2 l_1 l_{c2} c_2) \ddot{q}_1 + (I_2 + m_2 l_1 l_{c2} c_2)\ddot{q}_2 -
      2m_2 l_1 l_{c2} s_2 \dot{q}_1 \dot{q}_2 \\ \quad -m_2 l_1 l_{c2} s_2
      \dot{q}_2^2 + m_1 g l_{c1}s_1 + m_2 g (l_1 s_1 + l_{c2} s_{1+2}) = 0 \\
      (I_2 + m_2 l_1 l_{c2} c_2) \ddot{q}_1 + I_2 \ddot{q}_2 + m_2 l_1 l_{c2}
      s_2 \dot{q}_1^2 + m_2 g l_{c2} s_{1+2} = \tau \end{gather} In standard,
      manipulator equation form: $$\bM(\bq)\ddot\bq + \bC(\bq,\dot\bq)\dot\bq =
      \btau_g(\bq) + \bB\bu,$$ using $\bq = [\theta_1,\theta_2]^T$, $\bu = \tau$
      we have: \begin{gather} \bM(\bq) = \begin{bmatrix} I_1 + I_2 + m_2 l_1^2 +
      2m_2 l_1 l_{c2} c_2 & I_2 + m_2 l_1 l_{c2} c_2 \\ I_2 + m_2 l_1 l_{c2} c_2
      & I_2 \end{bmatrix},\label{eq:Hacrobot}\\ \bC(\bq,\dot{\bq}) =
      \begin{bmatrix} -2 m_2 l_1 l_{c2} s_2 \dot{q}_2 & -m_2 l_1 l_{c2} s_2
      \dot{q}_2 \\ m_2 l_1 l_{c2} s_2 \dot{q}_1 & 0 \end{bmatrix}, \\
      \btau_g(\bq) = \begin{bmatrix} -m_1 g l_{c1}s_1 - m_2 g (l_1 s_1 +
      l_{c2}s_{1+2}) \\ -m_2 g l_{c2} s_{1+2} \end{bmatrix}, \quad \bB =
      \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \end{gather}
      </p>
      <!--
      Solving directly for the accelerations, we have
      \begin{gather}
      \det(\bM) = I_1 I_2 + m_2 l_1^2 (1 - m_2 l_{c2}^2 c_2^2)
      \\
      \ddot{q}_1 = \frac{1}{det(\bM)} \left[ I_2 \left( m_2 l_1 l_{c2} s_2
        (2 \dot{q}_1 + \dot{q}_2) \dot{q}_2 - (m_1 l_{c1} + m_2 l_1) g s_1 -
        m_2 g l_{c2} s_{1+2} \right) + (I_{c2} + m_2 l_1 l_{c2} c_2) \left( -m_2
        l_1 l_{c2} s_2 \dot{q}_1^2 - m_2 g l_{c2} s_{1+2} \right)
        \right]
      \end{gather} -->

      <example><h1>The Acrobot in Python</h1>

        <p>You can experiment with the Acrobot dynamics in <drake></drake>
        using, e.g.</p>

        <jupyter>examples/acrobot/constant_torque.ipynb</jupyter>

      </example>

    </subsection> <!-- acrobot eqs of motion -->

  </section> <!-- acrobot -->

  <section id="cart_pole"><h1>The Cart-Pole system</h1>

    <p> The other model system that we will investigate here is the cart-pole
    system, in which the task is to balance a simple pendulum around its
    unstable equilibrium, using only horizontal forces on the cart.  Balancing
    the cart-pole system is used in many introductory courses in control,
    including 6.003 at MIT, because it can be accomplished with simple linear
    control (e.g. pole placement) techniques.  In this chapter we will consider
    the full swing-up and balance control problem, which requires a full
    nonlinear control treatment.</p>

    <figure>
      <img width="50%" src="figures/cartpole.svg"/>
      <figcaption>The Cart-Pole system.  Click <a href="http://youtu.be/Bzq96V1yN5k">here to see a real robot</a>.</figcaption>
      <todo> add swing-up + balance swf</todo>
    </figure>

    <p> The figure shows our parameterization of the system. $x$   is the
    horizontal position of the cart, $\theta$ is the   counter-clockwise angle
    of the pendulum (zero is hanging straight   down).  We will use $\bq =
    [x,\theta]^T$, and $\bx = [\bq,\dot\bq]^T$.   The task is to stabilize the
    unstable fixed point at $\bx = [0,\pi,0,0]^T.$ </p>

    <subsection><h1>Equations of motion</h1>

      <p>
      The kinematics of the system are given by
      \begin{equation}\bx_1 = \begin{bmatrix} x \\ 0
      \end{bmatrix}, \quad \bx_2 = \begin{bmatrix} x + l\sin\theta \\
      -l\cos\theta \end{bmatrix}. \end{equation}
      The energy is given by
      \begin{align} T=& \frac{1}{2} (m_c + m_p)\dot{x}^2 + m_p
      \dot{x}\dot\theta l \cos{\theta} + \frac{1}{2}m_p l^2 \dot\theta^2 \\
      U =& -m_p g l \cos\theta. \end{align}
      The Lagrangian yields the equations of motion:
      \begin{gather}
      (m_c + m_p)\ddot{x} + m_p l \ddot\theta \cos\theta - m_p l \dot\theta^2 \sin\theta = f \\
      m_p l \ddot{x} \cos\theta + m_p l^2 \ddot\theta + m_p g l \sin\theta = 0
      \end{gather}
      In standard, manipulator equation form:
      $$\bM(\bq)\ddot\bq + \bC(\bq,\dot\bq)\dot\bq = \btau_g(\bq) +
      \bB\bu,$$ using $\bq = [x,\theta]^T$, $\bu = f$, we have:
      \begin{gather*}
      \bM(\bq) = \begin{bmatrix} m_c + m_p & m_p l \cos\theta \\ m_p l
        \cos\theta & m_p l^2 \end{bmatrix}, \quad \bC(\bq,\dot{\bq}) =
      \begin{bmatrix} 0 & -m_p l \dot\theta \sin\theta \\ 0 & 0
      \end{bmatrix}, \\
      \btau_g(\bq) = \begin{bmatrix} 0 \\ - m_p gl \sin\theta \end{bmatrix},
      \quad \bB = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
      \end{gather*}
      In this case, it is particularly easy to solve directly for the accelerations:
      \begin{align}
      \ddot{x} =& \frac{1}{m_c + m_p \sin^2\theta}\left[ f+m_p \sin\theta (l \dot\theta^2 + g\cos\theta)\right] \label{eq:ddot_x}\\
      \ddot{\theta} =& \frac{1}{l(m_c + m_p \sin^2\theta)} \left[ -f
       \cos\theta - m_p l \dot\theta^2 \cos\theta \sin\theta - (m_c + m_p) g \sin\theta \right] \label{eq:ddot_theta}
      \end{align}
      In some of the analysis that follows, we will study the form of
      the equations of motion, ignoring the details, by arbitrarily setting
      all constants to 1:
      \begin{gather}
      2\ddot{x} + \ddot\theta \cos\theta - \dot\theta^2 \sin\theta = f \label{eq:simple}\\
      \ddot{x}\cos\theta + \ddot\theta + \sin\theta = 0. \label{eq:simple2}
      \end{gather}
      </p>

      <example><h1>The Cart-Pole System in
      Python</h1>

        <p>You can experiment with the Cart-Pole dynamics in <drake></drake>
        using, e.g.</p>

        <jupyter>examples/cartpole/constant_force.ipynb</jupyter>

      </example>

    </subsection> <!-- end eq of motion -->

  </section> <!-- end cart-pole -->

  <section><h1>Quadrotors</h1>

    <p>Quadrotors have become immensely popular over the last few years --
    advances in outrunner motors from the hobby community made them powerful,
    light-weight, and inexpensive!  They are strong enough to carry an
    interesting payload (e.g. of sensors for mapping / photographing the
    environment), but dynamic enough to move relatively quickly.  The most
    interesting dynamics start showing up at higher speeds, when the propellors
    start achieving lift from the airflow across them due to horizontal motion,
    or when they are close enough to the ground to experience significant
    ground-effect, but we won't try to capture those effects (yet) here.</p>

    <p>When the quadrotor revolution started to happen, I predicted that it
    would be followed quickly by a realization that fixed-wing vehicles are
    better for most applications.  Propellors are almost optimally efficient
    for producing thrust -- making quadrotors very efficient for hovering --
    but to be efficient in forward flight you probably want to have an airfoil.
    Wings are a really good idea!  But I was wrong -- quadrotors have
    completely dominated fixed-wings for commercial UAVs.  Perhaps it's only
    because they are easier to control?  Maybe there is still hope...</p>

    <subsection id="planar_quadrotor"><h1>The Planar Quadrotor</h1>

      <p>We can get started with an extremely simple model of a quadrotor that
      is restricted to live in the plane.  In that case, it actually only needs
      two propellors, but calling it a "birotor" doesn't have the same ring to
      it.  The equations of motion are almost trivial, since it is only a
      single rigid body, and certainly fit into our standard manipulator
      equations:
        \begin{gather}
        m \ddot{x} = -(u_1 + u_2)\sin\theta, \label{eq:quad_x}\\
        m \ddot{y} = (u_1 + u_2)\cos\theta - mg, \label{eq:quad_y}\\
        I \ddot\theta = r (u_1 - u_2) \label{eq:quad_theta}
        \end{gather}
      </p>

      <figure> <img width="50%" src="figures/quadrotor2d.svg"/>
      <figcaption>The Planar Quadrotor System (which we also refer to in the
      code as "Quadrotor2D", to keep it next to "Quadrotor" in the directory
      listing). The model parameters are mass, $m$, moment of inertia, $I$, and
      the distance from the center to the base of the propellor, $r$.
      </figcaption> </figure>


    </subsection>

    <subsection><h1>The Full 3D Quadrotor</h1>

      <p>See <code>drake/examples/Quadrotor</code>.  Will fill in the text
      description/derivation here soon!  The most interesting part is that the
      model needs to include the moment produced by the rotating props,
      otherwise the system linearized about the hovering configuration is
      actually not controllable.</p>

    </subsection>

  </section> <!-- end quadrotor2d -->

  <section><h1>Balancing</h1>

    <p> For both the Acrobot and the Cart-Pole systems, we will begin by
    designing a linear controller which can balance the system when it begins in
    the vicinity of the unstable fixed point.  To accomplish this, we will
    linearize the nonlinear equations about the fixed point, examine the
    controllability of this linear system, then using <a
    href="lqr.html">linear quadratic regulator (LQR)</a> theory to design
    our feedback controller.</p>

    <subsection><h1>Linearizing the manipulator equations</h1>

      <p>Although the equations of motion of both of these model systems are
      relatively tractable, the forward dynamics still involve quite a few
      nonlinear terms that must be considered in any linearization.  Let's
      consider the general problem of linearizing a system described by the
      manipulator equations.</p>

      <p>We can perform linearization around a fixed point, $(\bx^*, \bu^*)$,
      using a Taylor expansion:

      \begin{equation}
      \dot\bx = {\bf f}(\bx,\bu) \approx {\bf f}(\bx^*,\bu^*) + \left[ \pd{\bf
        f}{\bx}\right]_{\bx=\bx^*,\bu=\bu^*} (\bx - \bx^*) + \left[ \pd{\bf
        f}{\bu}\right]_{\bx=\bx^*,\bu=\bu^*} (\bu - \bu^*)
      \end{equation}

      Let us consider the specific problem of linearizing the manipulator
      equations around a (stable or unstable) fixed point.  In this case, ${\bf
      f}(\bx^*,\bu^*)$ is zero, and we are left with the standard linear
      state-space form:

      \begin{align}
      \dot\bx =& \begin{bmatrix} \dot\bq \\ \bM^{-1}(\bq) \left[ \btau_g(\bq) + {\bf
            B}(\bq)\bu - \bC(\bq,\dot\bq)\dot\bq \right]
            \end{bmatrix},\\
      \approx& {\bf A}_{lin} (\bx-\bx^*) + \bB_{lin} (\bu - \bu^*),
      \end{align}

      where ${\bf A}_{lin}$, and $\bB_{lin}$ are constant matrices.  Let us
      define $\bar\bx = \bx - \bx^*, \bar\bu = \bu - \bu^*$, and write
      $$\dot{\bar\bx} = {\bf A}_{lin}\bar\bx + \bB_{lin}\bar\bu.$$ Evaluation of
      the Taylor expansion around a fixed point yields the following, very
      simple equations, given in block form by:

      \begin{align}
      {\bf A}_{lin} =& \begin{bmatrix} {\bf 0} & {\bf I} \\ \bM^{-1} \pd{\btau_g}{\bq} + \sum_{j} \bM^{-1}\pd{\bB_j}{\bq} u_j  & {\bf 0} \end{bmatrix}_{\bx=\bx^*,\bu=\bu^*} \\
      \bB_{lin} =& \begin{bmatrix} {\bf 0} \\ \bM^{-1} \bB
        \end{bmatrix}_{\bx=\bx^*, \bu=\bu^*}
      \end{align}

      where $\bB_j$ is the $j$th column of $\bB$.  Note that the term involving
      $\pd{\bM^{-1}}{q_i}$ disappears because $\btau_g + \bB\bu - \bC\dot{\bq}$
      must be zero at the fixed point.  All of the $\bC\dot\bq$ derivatives drop
      out, too, because $\dot{\bq}^* = 0$, and any terms with $\bC$ drop out as
      well, since centripetal and centrifugal forces are zero at when velocity
      is zero.  In many cases, including both the Acrobot and Cart-Pole systems
      (but not the Quadrotors), the matrix $\bB(\bq)$ is a constant, so the
      $\pd\bB{\bq}$ terms also drop out. </p>

      <example><h1>Linearization of the Acrobot</h1>

        <p>Linearizing around the (unstable) upright point, we have:

        \begin{gather}
        \left[\pd{\bf \tau_g}{\bq}\right]_{\bx=\bx^*} = \begin{bmatrix} g (m_1 l_{c1}
            + m_2 l_1 + m_2 l_{c2}) & m_2 g l_{c2} \\ m_2 g l_{c2} & m_2 g l_{c2}
        \end{bmatrix}
        \end{gather}

        The linear dynamics follow directly from these equations and the manipulator
        form of the Acrobot equations.</p>

      </example>

      <example><h1>Linearization of the Cart-Pole System</h1>

      <p>Linearizing around the unstable fixed point in this system, we have:

        \begin{gather}
        \left[\pd{\btau_g}{\bq}\right]_{\bx=\bx^*} = \begin{bmatrix} 0 & 0 \\ 0
          & m_p g l \end{bmatrix}
        \end{gather}

        Again, the linear dynamics follow simply.</p>

      </example>

      <p>Studying the properties of the linearized system can tell us some
      things about the (local) properties of the nonlinear system.  For
      instance, having a stable linearization implies local exponential
      stability of the nonlinear system<elib>Slotine90+Hauser94a</elib>.   It's
      worth noting that having an unstable linearization also implies that the
      system is locally unstable, but if the linearization is marginally stable
      then one cannot conclude whether the nonlinear system is asymptotically
      stable, stable i.s.L., or unstable (only that it is not exponentially
      stable)<elib>Slotine90</elib>.</p>

    </subsection> <!-- end linearizing -->

    <subsection id="controllability"><h1>Controllability of linear systems</h1>

      <definition><h1>Controllability</h1>     A control system
      of the form $$\dot{\bx} = {\bf f}(\bx,\bu)$$ is called
      <em>controllable</em> if it is possible to construct an unconstrained
      input signal, $\bu(t)$, $t \in [0,t_f],$ which will move the system from
      any initial state to any final state in a finite interval of time, $0 < t
      < t_f$ <elib>Ogata96</elib>. </definition>

      <p> For the linear system $$\dot{\bx} = {\bf A}\bx + \bB\bu,$$ we can
      integrate this linear system in closed form, so it is possible to derive
      the exact conditions of controllability.  In particular, for linear
      systems it is sufficient to demonstrate that there exists a control input
      which drives any initial condition to the origin. </p>

      <subsubsection><h1>The special case of non-repeated
      eigenvalues</h1>

        <p> Let us first examine a special case, which falls short as a general
        tool but may be more useful for understanding the intuition of
        controllability. Let's perform an eigenvalue analysis of the system
        matrix ${\bf A}$, so that: $${\bf A}{\bf v}_i = \lambda_i {\bf v}_i,$$
        where $\lambda_i$ is the $i$th eigenvalue, and ${\bf v}_i$ is the
        corresponding (right) eigenvector. There will be $n$ eigenvalues for the
        $n \times n$ matrix ${\bf A}$. Collecting the (column) eigenvectors into
        the matrix ${\bf V}$ and the eigenvalues into a diagonal matrix ${\bf
        \Lambda}$, we have $${\bf A}{\bf V} = {\bf V}{\bf \Lambda}.$$ Here comes
        our primary assumption: let us assume that each of these $n$ eigenvalues
        takes on a distinct value (no repeats). With this assumption, it can be
        shown that the eigenvectors ${\bf v}_i$ form a linearly independent
        basis set, and therefore ${\bf V}^{-1}$ is well-defined.</p>

        <p> We can continue our eigenmodal analysis of the linear system by
        defining the modal coordinates, ${\bf r}$, with: $$\bx = {\bf V}{\bf
        r},\quad \text{or}\quad {\bf r} = {\bf V}^{-1}\bx.$$  In modal
        coordinates, the dynamics of the linear system are given by $$\dot{\bf
        r} = {\bf V}^{-1} {\bf A} {\bf V} {\bf r} + {\bf V}^{-1} \bB \bu = {\bf
        \Lambda} {\bf r} + {\bf V}^{-1}\bB \bu.$$  This illustrates the power of
        modal analysis; in modal coordinates, the dynamics diagonalize yielding
        independent linear equations: $$\dot{r}_i = \lambda_i r_i + \sum_j
        \beta_{ij} u_j,\quad {\bf \beta} = {\bf V}^{-1} \bB.$$</p>

        <p> Now the concept of controllability becomes clear.  Input $j$ can
        influence the dynamics in modal coordinate $i$ if and only if ${\bf
        \beta}_{ij} \neq 0$.  In the special case of non-repeated eigenvalues,
        having control over each individual eigenmode is sufficient to (in
        finite time) regulate all of the eigenmodes<elib>Ogata96</elib>.
        Therefore, we say that the system is controllable if and only if
        $$\forall i, \exists j \text{ such that }\beta_{ij} \neq 0.$$</p>

      </subsubsection>

      <subsubsection><h1>A general solution</h1>

        <details><summary>Included only for completeness.  Click to expand the details.</summary>

        <p> A more general solution to the controllability issue, which removes
        our assumption about the eigenvalues, can be obtained by examining the
        time-domain solution of the linear equations.  The solution of this
        system is $$\bx(t) = e^{{\bf A}t} \bx(0) + \int_0^{t} e^{{\bf A}(t -
        \tau)} \bB \bu(\tau) d\tau.$$ Without loss of generality, lets consider
        the that the final state of the system is zero.  Then we have: $$\bx(0)
        = - \int_0^{t_f} e^{-{\bf A}\tau}\bB \bu(\tau) d\tau.$$ You might be
        wondering what we mean by $e^{{\bf A}t}$; a scalar raised to the power
        of a matrix..?  Recall that $e^{z}$ is actually defined by a convergent
        infinite sum: $$e^{z} =1 + z + \frac{1}{2} z^2 + \frac{1}{6} z^3 + ...
        .$$ The notation $e^{{\bf A}t}$ uses the same definition: $$e^{{\bf A}t}
        = {\bf I} + {\bf A}t + \frac{1}{2}({\bf A}t)^2 + \frac{1}{6}({\bf A}t)^3
        + ... .$$ Not surprisingly, this has many special forms.  For instance,
        if ${\bf A}$ is diagonalizable, $e^{{\bf A}t} = {\bf
        V}e^{{\bf\Lambda}t}{\bf V}^{-1},$ where ${\bf A} = {\bf V \Lambda
        V}^{-1}$ is the eigenvalue decomposition of ${\bf A}$
        <elib>Strang98</elib>.  The particular form we will use here is
        $$e^{-{\bf A}\tau} = \sum_{k=0}^{n-1} \alpha_k(\tau) {\bf A}^k.$$ This
        is a particularly surprising form, because the infinite sum above is
        represented by this finite sum; the derivation uses Sylvester's Theorem
        <elib>Ogata96</elib>, <elib>Chen98a</elib>.  Then we have,
        \begin{align*} \bx(0) =& - \sum_{k=0}^{n-1} {\bf A}^k \bB \int_0^{t_f}
        \alpha_k(\tau) \bu(\tau) d\tau \\ =& -\sum_{k=0}^{n-1} {\bf A}^k \bB w_k
        \text{, where } \bw_k = \int_0^{t_f} \alpha_k(\tau) \bu(\tau) d\tau \\ =& -
        \begin{bmatrix} \bB & {\bf AB} & {\bf A}^2\bB & \cdots & {\bf
        A}^{n-1}\bB \end{bmatrix}_{n \times (nm)} \begin{bmatrix} \bw_0 \\ \bw_1 \\
        \bw_2 \\ \vdots \\ \bw_{n-1} \end{bmatrix} \end{align*} The matrix
        containing the vectors $\bB$, ${\bf AB}$, ... ${\bf A}^{n-1}\bB$ is
        called the controllability matrix.  In order for the system to be
        controllable, for every initial condition $\bx(0)$ we must be able to
        find the corresponding vector ${\bf w}$.  This is only possible when the
        rows of the controllability matrix are linearly independent.  Therefore,
        the condition of controllability is that this controllability matrix is
        full rank.</p>

        <p> Although we only treated the case of a scalar $u$, it is possible to
        extend the analysis to a vector $\bu$ of size $m$, yielding the
        condition $$\text{rank} \begin{bmatrix} \bB & {\bf AB} & {\bf A}^2\bB &
        \cdots & {\bf A}^{n-1}\bB \end{bmatrix}_{n \times (nm)} = n.$$ In Matlab
        (using the Control Systems Toolbox), you can obtain the controllability
        matrix using <code>Cm = ctrb(A,B)</code>, and evaluate its rank with
        <code>rank(Cm)</code>.</p>

        <p>Note that a linear feedback to change the eigenvalues of the
        eigenmodes is not sufficient to accomplish our goal of getting to the
        goal in finite time.  In fact, the open-loop control to reach the goal
        is easily obtained with a final-value LQR problem, and (for ${\bf
        R}={\bf I}$) is actually a simple function of the controllability
        Grammian<elib>Chen98a</elib>.</p>
        </details>

        <p></p>

      </subsubsection>

      <subsubsection><h1>Controllability vs. underactuated</h1>

        <p> Analysis of the controllability of both the Acrobot and Cart-Pole
        systems reveals that the linearized dynamics about the upright are, in
        fact, controllable.  This implies that the linearized system, if started
        away from the zero state, can be returned to the zero state in finite
        time.  This is potentially surprising - after all the systems are
        underactuated.  For example, it is interesting and surprising that the
        Acrobot can balance itself in the upright position without having a
        shoulder motor.</p>

        <p> The controllability of these model systems demonstrates an extremely
        important, point: <em>An underactuated system is not necessarily an
        uncontrollable system.</em>  Underactuated systems cannot follow
        arbitrary trajectories, but that does not imply that they cannot arrive
        at arbitrary points in state space.  However, the trajectory required to
        place the system into a particular state may be arbitrarily complex.</p>

        <p> The controllability analysis presented here is for linear
        time-invariant (LTI) systems.  A comparable analysis exists for linear
        time-varying (LTV) systems.  We will even see extensions to nonlinear
        systems; although it will often be referred to by the synonym of
        "reachability" analysis.</p>

      </subsubsection>

    </subsection> <!-- end controllability -->

    <subsection><h1>LQR feedback</h1>

      <p> Controllability tells us that a trajectory to the fixed point exists,
      but does not tell us which one we should take or what control inputs cause
      it to occur.  Why not?  There are potentially infinitely many solutions.
      We have to pick one.</p>

      <p> The tools for controller design in linear systems are very advanced.
      In particular, as we describe in the <a href="lqr.html">linear
      optimal control chapter</a>, one can easily design an optimal feedback
      controller for a regulation task like balancing, so long as we are
      willing to linearize the system around the operating point and define
      optimality in terms of a quadratic cost function: $$J(\bx_0) =
      \int_0^\infty \left[ \bx^T(t) \bQ \bx(t) + \bu^T(t) \bR \bu(t) \right]dt,
      \quad \bx(0)=\bx_0, \bQ=\bQ^T>0, \bR=\bR^T>0.$$ The linear feedback
      matrix $\bK$ used as $$\bu(t) = - \bK\bx(t),$$ is the so-called optimal
      linear quadratic regulator (LQR).  Even without understanding the
      detailed derivation, we can quickly become practitioners of LQR.
      <drake></drake> provides a function, <blockquote><code>K =
      LinearQuadraticRegulator(A, B, Q, R)</code></blockquote> for linear
      systems.  It also provides a version <blockquote><code>controller =
      LinearQuadraticRegulator(system, context, Q, R)</code></blockquote> that
      will linearize the system for you around an equilibrium and return the
      controller (in the original coordinates).  Therefore, to use LQR, one
      simply needs to define the symmetric positive-definite cost matrices,
      ${\bf Q}$ and ${\bf R}$. In their most common form, ${\bf Q}$ and ${\bf
      R}$ are positive diagonal matrices, where the entries $Q_{ii}$ penalize
      the relative errors in state variable $x_i$ compared to the other state
      variables, and the entries $R_{ii}$ penalize actions in $u_i$.</p>

      <example><h1>LQR for the Acrobot and Cart-Pole</h1>

        <p>Take a minute to play around with the LQR controller for the
        Acrobot and the Cart-Pole</p>

        <jupyter>examples/lqr.ipynb</jupyter>

        <p>Make sure that you take a minute to look at the code which runs
        during these examples.  Can you set the ${\bf Q}$ and ${\bf R}$ matrices
        differently, to improve the performance?</p>

      </example>

      <p> Simulation of the closed-loop response with LQR feedback shows that
      the task is indeed completed - and in an impressive manner.  Oftentimes
      the state of the system has to move violently away from the origin in
      order to ultimately reach the origin.  Further inspection reveals the
      (linearized) closed-loop dynamics are in fact non-minimum phase (acrobot
      has 3 right-half zeros, cart-pole has 1).</p>


      <example><h1>LQR for Quadrotors</h1>

        <p>LQR works essentially out of the box for Quadrotors, if linearized
        around a nominal fixed point (where the non-zero thrust from the
        propellers is balancing gravity).</p>

        <jupyter>examples/lqr.ipynb</jupyter>
        <p>or <a target="_blank" href="data/quadrotor2d_lqr.html">Click here for
        the animation</a>.</p>

      </example>

      <p> Note that LQR, although it is optimal for the linearized system, is
      not necessarily the best linear control solution for maximizing basin of
      attraction of the fixed-point.  The theory of <em>robust
      control</em>(e.g., <elib>Zhou97</elib>), which explicitly takes into
      account the differences between the linearized model and the nonlinear
      model, will produce controllers which outperform our LQR solution in this
      regard.</p>

    </subsection>

  </section> <!-- end balancing -->

  <section><h1>Partial feedback linearization</h1>

    <p> In the introductory chapters, we made the point that the underactuated
    systems are not feedback equivalent to $\ddot{q} = \bu$.   Although we
    cannot always simplify the full dynamics of the system, it is still possible
    to linearize a portion of the system dynamics.  The technique is called
    <em>partial feedback linearization</em>.</p>

    <p> Consider the cart-pole example.  The dynamics of the cart are affected
    by the motions of the pendulum.  If we know the model, then it seems quite
    reasonable to think that we could create a feedback controller which would
    push the cart in exactly the way necessary to counter-act the dynamic
    contributions from the pendulum - thereby linearizing the cart dynamics.
    What we will see, which is potentially more surprising, is that we can also
    use a feedback controller for the cart to feedback linearize the dynamics
    of the passive pendulum joint.</p>

    <p> We'll use the term <em>collocated</em> partial feedback linearization to
    describe a controller which linearizes the dynamics of the actuated joints.
    What's more surprising is that it is often possible to achieve
    <em>non-collocated</em> partial feedback linearization - a controller which
    linearizes the dynamics of the unactuated joints. The treatment presented
    here follows from <elib>Spong94a</elib>.</p>

    <subsection><h1>PFL for the Cart-Pole System</h1>

      <subsubsection><h1>Collocated</h1>

        <p> Starting from the equations \ref{eq:simple} and \ref{eq:simple2}, we
        have

          \begin{gather*}
          \ddot\theta = -\ddot{x}c - s \\
          % \ddot\theta = -\frac{1}{l} (\ddot{x} c + g s)
          \ddot{x}(2-c^2) - sc - \dot\theta^2 s = f
          \end{gather*}
          Therefore, applying the feedback control
          \begin{equation}
          f = (2 - c^2) \ddot{x}^d - sc - \dot\theta^2 s
          % f = (m_c + m_p) u + m_p (u c + g s) c - m_p l \dot\theta^2 s
          \end{equation}
          results in
          \begin{align*}
          \ddot{x} =& \ddot{x}^d \\ \ddot{\theta} =& -\ddot{x}^dc - s,
          \end{align*}

        which are valid globally.</p>
        
        <p>Look carefully at the resulting equations.  Of course, it says that we can impose whatever accelerations we like on the cart.  But even the resulting equations of the pole happen to take a nice form here:  they have been reduced to the equations of the simple pendulum (without a cart), where the torque input is now given instead by $\ddot{x}c$.  It's as if we have a simple pendulum with torque control, except our command is modulated by a $\cos\theta$ term, and this $\cos\theta$ term is fundamental -- it's true that our control authority goes to zero when the pole is horizontal, because no amount of force on the cart in that configuration can act like a torque on the pole.</p>

      </subsubsection>

      <subsubsection><h1>Non-collocated</h1>

        <p> Starting again from equations \ref{eq:simple} and \ref{eq:simple2},
        we have

        \begin{gather*}
        \ddot{x} = -\frac{\ddot\theta + s}{c} \\
        \ddot\theta(c - \frac{2}{c}) - 2 \tan\theta - \dot\theta^2 s = f
        \end{gather*}

        Applying the feedback control

        \begin{equation}
          f = (c - \frac{2}{c}) \ddot\theta^d - 2 \tan\theta - \dot\theta^2 s
        \end{equation} results in
        \begin{align*}
          \ddot\theta =& \ddot\theta^d \\ \ddot{x} =&
          -\frac{1}{c} \ddot\theta^d - \tan\theta.
        \end{align*}

        Note that this expression is only valid when $\cos\theta \neq 0$. Once again, we know that the force cannot create a torque when
        the pole is perfectly horizontal.  In fact, the controller we have written will "blow-up" -- requesting infinite force at $\cos\theta = 0$; so make sure you saturate the command before you implement it on hardware (or even in simulation).  Although it may come as a small concellation, at least we have that $(c - \frac{2}{c})$ never goes to zero; in fact you can check for yourself that $|c - \frac{2}{c}| \ge 1$.</p>

      </subsubsection>

    </subsection> <!-- end PFL for the cart-pole -->

    <subsection><h1>General form</h1>

      <p> For systems that are trivially underactuated (torques on some joints,
      no torques on other joints), we can, without loss of generality,
      reorganize the joint coordinates in any underactuated system described by
      the manipulator equations into the form:

      \begin{align}
      \bM_{11} \ddot{\bq}_1 + \bM_{12} \ddot{\bq}_2 &= \btau_1, \label{eq:passive_dyn}\\
      \bM_{21} \ddot{\bq}_1 + \bM_{22} \ddot{\bq}_2 &= \btau_2 + \bu, \label{eq:active_dyn}
      \end{align}

      with $\bq \in \Re^n$, $\bq_1 \in \Re^l$, $\bq_2 \in \Re^m$, $l=n-m$.
      $\bq_1$ represents all of the passive joints, and $\bq_2$ represents all
      of the actuated joints, and the $\btau = \btau_g - \bC\dot\bq$ terms
      capture all of the Coriolis and gravitational terms, and $$\bM(\bq) =
      \begin{bmatrix} \bM_{11} & \bM_{12} \\ \bM_{21} & \bM_{22}
      \end{bmatrix}.$$ Fortunately, because $\bM$ is uniformly (e.g. for all
      $\bq$) positive definite, $\bM_{11}$ and $\bM_{22}$ are also positive
      definite, by the <a href="https://en.wikipedia.org/w/index.php?title=Schur_complement">Schur complement condition for positive definiteness</a>.</p>

      <subsubsection><h1>Collocated linearization</h1>

        <p> Performing the same substitutions into the full manipulator
        equations, we get:

        \begin{gather}
        \ddot\bq_1 = \bM_{11}^{-1} \left[ \btau_1 - \bM_{12} \ddot\bq_2
          \right] \\
        (\bM_{22} - \bM_{21} \bM_{11}^{-1} \bM_{12})
          \ddot\bq_2 - \btau_2 + \bM_{21} \bM_{11}^{-1} \btau_1 = \bu
        \end{gather}

        It can be easily shown that the matrix $(\bM_{22} - \bM_{21}
        \bM_{11}^{-1} \bM_{12})$ is invertible<elib>Spong94a</elib>; we can see
        from inspection that it is symmetric.  PFL follows naturally, and is
        valid globally.</p>

      </subsubsection>

      <subsubsection><h1>Non-collocated linearization</h1>

        <p>
        \begin{gather}
        \ddot\bq_2 = \bM_{12}^+ \left[ \btau_1 - \bM_{11} \ddot\bq_1
          \right] \\
        (\bM_{21} - \bM_{22} \bM_{12}^+ \bM_{11})
          \ddot\bq_1 - \btau_2 + \bM_{22} \bM_{12}^+ \btau_1 = \bu
        \end{gather}

        where $\bM_{12}^+$ is a Moore-Penrose pseudo-inverse.  This inverse
        provides a unique solution when the rank of $\bM_{12}$ equals $l$, the
        number of passive degrees of freedom in the system (it cannot be more,
        since the matrix only has $l$ rows).  The rank condition in this 
        context is sometimes called the property of "Strong Inertial Coupling".  It is state
        dependent; in the cart-pole example above $\bM_{12} = \cos\theta$ and drops rank exactly when $\cos\theta = 0$. A system has Global Strong Inertial Coupling if it exhibits
        Strong Inertial Coupling in every state.</p>

      </subsubsection>

      <subsubsection id="task_space_pfl"><h1>Task-space partial feedback linearization</h1>

        <p> In general, we can define some combination of active and passive
        joints that we would like to control.  This combination is sometimes
        called a "task space". <todo>cite some task space refs here?</todo>
        Consider an output function of the form, $$\by = \bh(\bq),$$ with ${\bf
        y} \in \Re^p$, which defines the task space.  Define $\bH_1 =
        \frac{\partial \bh}{\partial \bq_1}$, $\bH_2 = \frac{\partial
        \bh}{\partial \bq_2}$, $\bH = [\bH_1,\bH_2]$.</p>

        <theorem><h1>Task Space PFL</h1>

        <p> If the actuated joints are commanded so that \begin{equation}
        \ddot\bq_2 = \bar\bH^+ \left [\ddot\by^d - \dot\bH\dot\bq - \bH_1
        \bM_{11}^{-1}\btau_1 \right],  \label{eq:q2cmd} \end{equation} where
        $\bar{\bH} = \bH_2 - \bH_1 \bM_{11}^{-1} \bM_{12}.$ and $\bar\bH^+$ is
        the right Moore-Penrose pseudo-inverse, $$\bar\bH^+ = \bar\bH^T (\bar\bH
        \bar\bH^T)^{-1},$$ then we have \begin{equation} \ddot\by =
        \ddot\by^d.\end{equation} subject to \begin{equation}\text{rank}\left(\bar{\bH}
        \right) = p, \label{eq:rank_condition}\end{equation} </p>

        <p><b>Proof Sketch.</b>
        Differentiating the output function we have \begin{gather*} \dot\by =
        \bH \dot\bq \\ \ddot\by = \dot\bH \dot\bq + \bH_1 \ddot\bq_1 + \bH_2
        \ddot\bq_2. \end{gather*} Solving \ref{eq:passive_dyn} for the dynamics
        of the unactuated joints we have: \begin{equation} \ddot\bq_1 =
        \bM_{11}^{-1} (\btau_1 - \bM_{12} \ddot\bq_2) \label{eq:q1cmd}
        \end{equation} Substituting, we have \begin{align} \ddot\by =& \dot\bH
        \dot\bq + \bH_1 \bM_{11}^{-1}(\btau_1 - \bM_{12}\ddot\bq_2) + \bH_2
        \ddot\bq_2 \\ =& \dot\bH \dot\bq + \bar{\bH} \ddot\bq_2 + \bH_1
        \bM_{11}^{-1}\btau_1 \\ =& \ddot\by^d \end{align} Note that the last
        line required the rank condition ($\ref{eq:rank_condition}$) on
        $\bar\bH$ to ensure that the rows of $\bar{\bH}$ are linearly
        independent, allowing $\bar\bH \bar\bH^+ = \bI$.</p>

      </theorem>

        <p> In order to execute a task space trajectory one could command
        $$\ddot\by^d = \ddot{\bar{\by}}^d + \bK_d (\dot{\bar{\by}}^d - \dot\by)
        + \bK_p (\bar{\by}^d -\by).$$ Assuming the internal dynamics are stable,
        this yields converging error dynamics, $(\bar{\by}^d - \by)$, when
        $\bK_p,\bK_d > 0$<elib>Slotine90</elib>.  For a position control robot,
        the acceleration command of ($\ref{eq:q2cmd}$) suffices. Alternatively,
        a torque command follows by substituting ($\ref{eq:q2cmd}$) and
        ($\ref{eq:q1cmd}$) into ($\ref{eq:active_dyn}$).</p>

        <example><h1>End-point trajectory following with the Cart-Pole system</h1>

        <p> Consider the task of trying to track a desired kinematic trajectory
        with the endpoint of pendulum in the cart-pole system.  With one
        actuator and kinematic constraints, we might be hard-pressed to track a
        trajectory in both the horizontal and vertical coordinates.  But we can
        at least try to track a trajectory in the vertical position of the
        end-effector.</p>

        <p>Using the task-space PFL derivation, we have:
        \begin{gather*}
        y = h(\bq) = -l \cos\theta \\
        \dot{y} = l \dot\theta \sin\theta
        \end{gather*}
        If we define a desired trajectory:
        $$\bar{y}^d(t) = \frac{l}{2} + \frac{l}{4} \sin(t),$$
        then the task-space controller is easily implemented using the
        derivation above.</p>

        <todo>add results here</todo>
        </example> 

        <p> The task space derivation above provides a convenient generalization
        of the partial feedback linearization (PFL) <elib>Spong94a</elib>, which
        encompasses both the collocated and non-collocated results.  If we
        choose $\by = \bq_2$ (collocated), then we have $$\bH_1 = 0, \bH_2 =
        \bI, \dot\bH = 0, \bar\bH = \bI, \bar\bH^+ = \bI.$$ From this, the
        command in ($\ref{eq:q2cmd}$) reduces to $\ddot\bq_2 = \ddot\bq_2^d$.
        The actuator command is then $$ \bu = \bM_{21} \bM_{11}^{-1} (\btau_1 -
        \bM_{12} \ddot\bq_2^d) + \bM_{22} \ddot\bq_2^d + \btau_2,$$ and the rank
        condition ($\ref{eq:rank_condition}$) is always met.</p>

        <p> If we choose ${\bf y} = \bq_1$ (non-collocated), we have $$\bH_1 =
        \bI, \bH_2 = 0, \dot\bH = 0, \bar{\bH}=-\bM_{11}^{-1}\bM_{12}.$$ The
        rank condition ($\ref{eq:rank_condition}$) requires that
        $\text{rank}(\bM_{12}) = l$, in which case we can write
        $\bar{\bH}^+=-\bM_{12}^+\bM_{11}$, reducing the rank condition to
        precisely the "Strong Inertial Coupling" condition described in
        <elib>Spong94a</elib>. Now the command in ($\ref{eq:q2cmd}$) reduces to
        \begin{equation} \ddot\bq_2 = \bM_{12}^+ \left [ \btau_1 -
        \bM_{11}\ddot\bq_1^d \right] \label{eq:q2ddnonco} \end{equation} The
        actuator command is found by substituting ($\ref{eq:q2ddnonco}$) into
        ($\ref{eq:active_dyn}$), yielding the same results as in
        <elib>Spong94a</elib>.</p>

      </subsubsection> <!-- task space PFL -->
    </subsection>
  </section>

  <section><h1>Swing-up control</h1>

    <subsection><h1>Energy shaping</h1>

      <p> Recall in the last chapter, <a href="pend.html#energy_shaping">we used
      energy shaping to swing up the simple pendulum</a>.  This idea turns out
      to be a bit more general than just for the simple pendulum. As we will
      see, we can use similar concepts of "energy shaping" to produce swing-up
      controllers for the acrobot and cart-pole systems. It's important to note
      that it only takes one actuator to change the total energy of a
      system.</p>

      <p> Although a large variety of swing-up controllers have been proposed
      for these model systems (c.f. <elib>Fantoni02</elib>,
      <elib>Araki05</elib>, <elib>Xin04</elib>, <elib>Spong94</elib>,
      <elib>Mahindrakar05</elib>, <elib>Berkemeier99</elib>,
      <elib>Murray91</elib>, <elib>Lai06</elib>), the energy shaping controllers
      tend to be the most natural to derive and perhaps the most well-known.</p>

    </subsection>

    <subsection><h1>Cart-Pole</h1>

      <p> Let's try to apply the energy-shaping ideas to the cart-pole system.
      The basic idea, from <elib>Chung95</elib>, is to use collocated PFL to
      simplify the dynamics, use energy shaping to regulate the pendulum to its
      homoclinic orbit, then to add a few terms to make sure that the cart stays
      near the origin.  This is a bit surprising... if we want to control the
      pendulum, shouldn't we use the non-collocated version?  Actually, we
      ultimately want to control both the cart and the pole, and we will build
      on the collocated version both because it avoids inverting the
      $\cos\theta$ term that can go to zero and because (when all parameters are
      set to 1) we were left with a particularly simple form of the equations:
      \begin{gather} \ddot{x} = u \\ \ddot\theta = - u c - s. %\ddot\theta =
      -\frac{1}{l}( u c + g s ) \end{gather} The first equation is clearly
      simple, but even the second equation is exactly the equations of a
      <i>decoupled</i> pendulum, just with a slightly odd control input that is
      modulated by $\cos\theta$.</p>

      <p>Let us regulate the energy of this decoupled pendulum using energy
      shaping. The energy of the pendulum (a unit mass, unit length, simple
      pendulum in unit gravity) is given by: $$E(\bx) = \frac{1}{2} \dot\theta^2
      - \cos\theta.$$ <!-- E = \frac{1}{2}ml^2\dot\theta^2 - mgl\cos\theta -->
      The desired energy, equivalent to the energy at the desired fixed-point,
      is $$E^d = 1.$$ <!--  E^d = m_p g l --> Again defining $\tilde{E}(\bx) =
      E(\bx) - E^d$, we now observe that
      \begin{align*}
      \dot{\tilde{E}}(\bx) =& \dot{E}(\bx) = \dot\theta \ddot\theta +
      \dot\theta s \\
      % \dot\tilde{E} = ml^2 \dot\theta \ddot\theta + mgl s
      =& \dot\theta [ -uc - s] + \dot\theta s \\
      % - ml \dot\theta [ u c + g s ] + mgl s
      =& - u\dot\theta \cos\theta.
      % - ml u \dot\theta c
      \end{align*}
      </p>

      <p>Therefore, if we design a controller of the form $$u = k
      \dot\theta\cos\theta \tilde{E},\quad k>0$$ the result is $$\dot{\tilde{E}}
      = - k \dot\theta^2 \cos^2\theta \tilde{E}.$$  <!-- times ml --> This
      guarantees that $| \tilde{E} |$ is non-increasing, but isn't quite enough
      to guarantee that it will go to zero.  For example, if $\theta =
      \dot\theta = 0$, the system will never move.  However, if we have that
      $$\int_0^t \dot\theta^2(t') \cos^2 \theta(t') dt' \rightarrow \infty,\quad
      \text{as } t \rightarrow \infty,$$ then we have $\tilde{E}(t) \rightarrow
      0$. This condition, a version of the LaSalle's theorem that we will
      develop in our notes on Lyapunov methods, is satisfied for all but the
      trivial constant trajectories at fixed points.</p>

      <p> Now we return to the full system dynamics (which includes the cart).
      <elib>Chung95</elib> and <elib>Spong96</elib> use the simple pendulum
      energy controller with an addition PD controller designed to regulate the
      cart: $$\ddot{x}^d = k_E \dot\theta \cos\theta \tilde{E} - k_p x - k_d
      \dot{x}.$$ <elib>Chung95</elib> provides a proof of convergence for this
      controller with some nominal parameters.  <elib>Fantoni02</elib> provides
      a slightly different controller derived directly from a Lyapunov
      argument.</p>

      <figure>
      <img width="70%" src="figures/cartpole_swingup_phase.svg"/>

      <figcaption>Cart-Pole Swingup: Example phase plot of the pendulum
      subsystem using energy shaping control.  The controller drives the system
      to the homoclinic orbit, then switches to an LQR balancing controller near
      the top.</figcaption>
      </figure>

    </subsection>

    <subsection><h1>Acrobot</h1>

      <p> Swing-up control for the acrobot can be accomplished in much the same
      way.  <elib>Spong94</elib> - pump energy.  Clean and simple.  No proof.
      Slightly modified version (uses arctan instead of sat) in
      <elib>Spong95</elib>. Clearest presentation in <elib>Spong96</elib>.</p>

      <p> Use collocated PFL.  ($\ddot{q}_2 = \ddot{q}_2^d$). $$E(\bx) =
      \frac{1}{2}\dot\bq^T\bM\dot{\bq} + U(\bx).$$ $$ E_d = U(\bx^*).$$
      $$\bar{u} = \dot{q}_1 \tilde{E}.$$ $$\ddot{q}_2^d  = - k_1 q_2 - k_2
      \dot{q}_2 + k_3 \bar{u},$$</p>

      <p> Extra PD terms prevent proof of asymptotic convergence to homoclinic
      orbit.  Proof of another energy-based controller in
      <elib>Xin04</elib>.</p>

    </subsection>

    <subsection><h1>Discussion</h1>

      <p> The energy shaping controller for swing-up presented here is a pretty
      faithful representative from the field of nonlinear underactuated control.
      Typically these control derivations require some clever tricks for
      simplifying or canceling out terms in the nonlinear equations, then some
      clever Lyapunov function to prove stability.  In these cases, PFL was used
      to simplify the equations, and therefore the controller design.</p>

      <p> These controllers are important, representative, and relevant.  But
      clever tricks with nonlinear equations seem to be fundamentally limited.
      Most of the rest of the material presented in this book will emphasize
      more general computational approaches to formulating and solving these and
      other control problems. </p>

    </subsection>

  </section>

  <section id="differential_flatness"><h1>Differential Flatness</h1>

    <p>Although the "swing up and balance" task is quite interesting for the
    acrobot and cart-pole systems, quadrotors tend to inspire a slightly
    different question.  First of all, stabilizing a hovering fixed point is
    relatively easy on a quadrotor -- even our LQR controller will have an
    extremely large region of attraction.  But people also tend to want to do
    more with their quadrotors than just have them hover in place.  The task
    we'll consider here is trajectory planning: how can you find a feasible
    trajectory through state space for the quadrotor, even if there are
    obstacles to avoid that are only known at runtime? </p>

    <p>Trajectory design, and especially trajectory optimization, is a big idea
    that we will <a href="trajopt.html">explore more
    thoroughly later in the text.</a>  But there is one idea that I would like
    to present here, because in addition to being a very satisfying solution
    for quadrotors, it is philosophically quite close to the idea of partial
    feedback linearization.  That idea is called differential flatness.</p>

    <p>One of the most important lessons from partial feedback linearization,
    is the idea that if you have $m$ actuators, then you basically get to
    control exactly $m$ quantities of your system.  With the "task space"
    partial feedback linearization, we showed that these can be chosen in a
    fairly general way.  Of course there are some requirements -- captured
    above by the rank condition which checks for "inertial coupling".
    Differential flatness exploits this same idea (choosing $m$ outputs), but
    in the opposite direction.  The idea is that, for some systems, if you give
    me a trajectory in those $m$ coordinates, it may in fact dictate what all
    of the states/actuators must have been doing.  The fact that you can only
    execute a subset of possible trajectories can, in this case, make
    trajectory planning much easier!</p>

    <p>Let's start with an example...</p>

    <example><h1>Differential flatness for the Planar Quadrotor</h1>

      <p>The planar quadrotor model described above has 3 degrees of freedom
      ($x,y,\theta$) but only two actuators (one for each propellor).  My claim
      is that, if you give me a trajectory for just the location of the center
      of mass: $x(t), y(t), \forall t \in [t_0, t_f],$ then I will be able to
      infer what $\theta(t)$ must be over that same interval in order to be
      feasible.  Furthermore, I can even infer $\bu(t)$. There is one technical
      condition required: the trajectory you give me for $x(t)$ and $y(t)$
      needs to be continuously differentiable (four times).</p>

      <p>To see this, first observe that from ($\ref{eq:quad_x}$) and
      ($\ref{eq:quad_y}$) we have $$\frac{-m \ddot{x}}{ m \ddot{y} + mg} =
      \frac{(u_1 + u_2)\sin\theta}{(u_1+u_2)\cos\theta} = \tan\theta.$$ In
      words, given $\ddot{x}(t)$ and $\ddot{y}(t)$, I can solve for
      $\theta$(t).  I can differentiate this relationship (in time) twice more
      to obtain $\ddot\theta$.  Using ($\ref{eq:quad_theta}$) with
      ($\ref{eq:quad_x}$) or ($\ref{eq:quad_y}$) give us two linear equations
      for $u_1$ and $u_2$ which are easily solved.</p>

      <p>Now you can see why we need the original trajectories to be smooth --
      the solution to $\bu(t)$ depends on $\ddot\theta(t)$ which depends on
      $\frac{d^4 x(t)}{dt^4}$ and $\frac{d^4 y(t)}{dt^4}$; we need those
      derivatives to exist along the entire trajectory.</p>

      <p>What's more -- if we ignore input limits for a moment -- <i>any</i>
      sufficiently smooth trajectory of $x(t), y(t)$ is feasible, so if I can
      simply find one that avoids the obstacles, then I have designed my state
      trajectory.  As we will see, optimizing even high-degree
      piecewise-polynomials is actually an easy problem (it works out to be a
      quadratic program), assuming the constraints are convex.  In practice,
      this means that once you have determined whether to go left or right
      around the obstacles, trajectory design is easy and fast.</p>

      <p>I've coded up a simple example of that for you here:</p>

      <jupyter>examples/quadrotor2d/differential_flatness.ipynb</jupyter>

      <figure> <img width="100%" src="figures/differential_flatness.svg"/>
      <figcaption>Differential flatness for the planar quadrotor -- by solving
      a simple optimization to find a smooth trajectory for $x(t)$ and $y(t)$,
      I can back out $\theta(t)$ and even $\bu(t)$. </figcaption> </figure>

    </example>

    <p>The example above demonstrates that the planar quadrotor system is
    differentially flat in the outputs $x(t)$, $y(t)$.  More generally, if we
    have a dynamical system $$\dot{\bx} = f(\bx, \bu),$$ and we design some
    output coordinates (essentially "task-space"): $$\bz(t) = h\left(\bx, \bu,
    \frac{d\bu}{dt}, ..., \frac{d^k\bu}{dt^k}\right),$$ such that we can write
    the $\bx$ and $\bu$ purely as a function of the output and it's time
    derivatives, \begin{gather*} \bx (t) = \bx\left(\bz, \frac{d\bz}{dt}, ...,
    \frac{d^k\bz}{dt^k}\right), \\ \bu (t) = \bu\left(\bz, \frac{d\bz}{dt}, ..,
    \frac{d^k\bz}{dt^k}\right),\end{gather*} then we say that the system $f$ is
    differentially flat in the outputs $\bz$ <elib>Murray95</elib>.  And the
    requirement for showing that a system is differentially flat in those
    outputs is to write the function which solves for $\bx(t)$ and $\bu(t)$ as a
    function of only $\bz(t)$ and its time derivatives.</p>

    <p>I'm not aware of any numerical recipes for showing that a system is
    differentially flat nor for finding potential flat outputs, but I admit I
    haven't worked on it nor looked for those recipes.  That would be
    interesting!  I think of differential flatness as a property that one must
    find for your system -- typically via a little mechanical intuition and a
    lot of algebra.  Once found, it can be very useful.</p>

    <example><h1>Differential flatness for the 3D Quadrotor</h1>

      <p>Probably the most famous example of differential flatness is on the
      full 3D quadrotors. <elib>Mellinger11</elib> showed that the 3D quadrotor
      is differentially flat in the outputs $\{x,y,z,\theta_{yaw}\}$.  They used
      this, to dramatic effect, to perform all sorts of acrobatics.  The
      resulting videos are awesome (and probably deserve a lot of credit for the
      widespread popularity of quadrotors in academia and industry over the next
      few years).
      </p>

      <figure>
        <iframe width="560" height="315"
                src="https://www.youtube.com/embed/MvRTALJp8DM" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
        <iframe width="560" height="315"
                src="https://www.youtube.com/embed/geqip_0Vjec" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
        <figcaption>Aggressive quadrotor trajectories using differential
          flatness from <elib>Mellinger11</elib>.</figcaption>
      </figure>

      <p>A few things to note about these examples, just so we also understand
      the limitations.  First, the technique above is great for designing
      trajectories, but additional work is required to stabilizing those
      trajectories (we'll cover that topic in more detail later in the notes).
      Trajectory stabilization benefits greatly from good state estimation, and
      the examples above were all performed in a motion capture arena. Also,
      the "simple" version of the trajectory design that is fast enough to be
      used for flying through moving hoops is restricted to convex optimization
      formulations -- which means one has to hard-code apriori the decisions
      about whether to go left or right / up or down around each obstacle.</p>

    </example>

  </section>

  <section><h1>Other model systems</h1>

    <p> The acrobot and cart-pole systems are just two of the model systems used
    heavily in underactuated control research.  Other examples include:</p> <ul>
    <li>Pendubot</li> <li>Inertia wheel pendulum</li> <li>Furuta pendulum
    (horizontal rotation and vertical pendulum)</li> <li>Hovercraft</li> </ul>


  </section>

  <section><h1>Exercises</h1>

    <exercise><h1>Cart-Pole: Linearization and Balancing</h1>

      <p>For this exercise you will work exclusively in <a href="https://colab.research.google.com/github/RussTedrake/underactuated/blob/master/exercises/acrobot/cartpole_balancing/cartpole_balancing.ipynb" target="_blank">this notebook</a>.  You will be asked to complete the following steps.</p>

      <ol type="a">

        <li>Derive the state-space dynamics $\dot{\bx} = f(\bx, \bu)$ of <a href="#cart_pole">the cart-pole system</a>.</li>

        <li>Linearize the dynamics from point (a) around the unstable equilibrium point.</li>

        <li>Analyze the linearization error for different values of the state $\bx$ and the control $\bu$.</li>

        <li>Among the states from point (c), identify which ones are stabilized to the origin by the LQR controller.</li>

      </ol>

    </exercise>

  </section>

</chapter>
<!-- EVERYTHING BELOW THIS LINE IS OVERWRITTEN BY THE INSTALL SCRIPT -->

<table style="width:100%;"><tr style="width:100%">
  <td style="width:33%;text-align:left;"><a class="previous_chapter" href=pend.html>Previous Chapter</a></td>
  <td style="width:33%;text-align:center;"><a href=index.html>Table of contents</a></td>
  <td style="width:33%;text-align:right;"><a class="next_chapter" href=simple_legs.html>Next Chapter</a></td>
</tr></table>

<div id="footer">
  <hr>
  <table style="width:100%;">
    <tr><td><em>Underactuated Robotics</em></td><td align="right">&copy; Russ
      Tedrake, 2020</td></tr>
  </table>
</div>


</body>
</html>
